OGRLayer::GetArrowStream(): add a DATETIME_AS_STRING=YES/NO option #11213
DATETIME_AS_STRING=YES/NO. Defaults to NO. Added in GDAL 3.11. Whether DateTime fields should be returned as a (normally ISO-8601 formatted) string by drivers. The aim is to be able to handle mixed timezones (or timezone-naive values) in the same column. All drivers must honour that option, and potentially fall back to the OGRLayer generic implementation if they cannot (which is the case for the Arrow, Parquet and ADBC drivers). When DATETIME_AS_STRING=YES, the TIMEZONE option is ignored. Fixes geopandas/pyogrio#487
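To illustrate the motivation, here is a minimal standard-library sketch (it does not call GDAL) of what happens to a column with mixed timezones. The sample values and both helper names are made up for illustration, and treating naive values as UTC is an assumption of this sketch only, not GDAL's behaviour.

```python
from datetime import datetime, timezone

# Hypothetical column values: two different UTC offsets and one naive value.
values = [
    "2024-03-01T10:00:00+01:00",
    "2024-03-01T10:00:00-05:00",
    "2024-03-01T10:00:00",
]


def to_single_timezone(iso_values):
    """Normalize every value to UTC, as a single-timezone timestamp column requires.

    Assumption of this sketch: naive values are interpreted as UTC.
    """
    out = []
    for text in iso_values:
        dt = datetime.fromisoformat(text)
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        out.append(dt.astimezone(timezone.utc))
    return out


def origin_offsets(iso_values):
    """Return the UTC offset carried by each string, or None for naive values."""
    return [datetime.fromisoformat(text).utcoffset() for text in iso_values]


normalized = to_single_timezone(values)

# After normalization every value carries the same (zero) offset, so the
# original offsets and the naive/aware distinction are no longer recoverable.
print([dt.isoformat() for dt in normalized])

# The strings themselves still carry that information.
print(origin_offsets(values))
```

Keeping the values as strings leaves the decision about how to interpret each offset to the consumer.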
@theroggy @jorisvandenbossche I'm thinking that in this DATETIME_AS_STRING=YES mode, in the ArrowSchema of datetime fields exposed as string (format='u'), we should probably also set the metadata field with a hint for the DateTime semantics. Any suggestion of an appropriate value for it?
Thanks a lot for looking into this!
Would you just want to indicate that the original GDAL/OGR type was a DateTime? Or is there more information about the column that GDAL can know at that point?
Actually, I'm just remembering that we already have something. https://gdal.org/en/latest/doxygen/classOGRLayer.html#a3ffa8511632cbb7cff06a908e6668f55 mentions:
Those are only filled when they cannot be expressed with an Arrow concept.
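For reference, a rough sketch of how such a hint is laid out in the ArrowSchema.metadata buffer, following the binary layout described by the Arrow C data interface (a native-endian int32 pair count, then length-prefixed keys and values). The two function names are hypothetical helpers written for this illustration; this is not GDAL's code.

```python
import struct


def encode_arrow_metadata(pairs):
    """Serialize a dict of str -> str using the Arrow C data interface metadata layout."""
    buf = struct.pack("=i", len(pairs))  # number of key/value pairs
    for key, value in pairs.items():
        k = key.encode("utf-8")
        v = value.encode("utf-8")
        # Each key and value is prefixed by its byte length; nothing is null-terminated.
        buf += struct.pack("=i", len(k)) + k
        buf += struct.pack("=i", len(v)) + v
    return buf


def decode_arrow_metadata(buf):
    """Parse the layout produced by encode_arrow_metadata back into a dict."""
    (count,) = struct.unpack_from("=i", buf, 0)
    offset = 4
    pairs = {}
    for _ in range(count):
        (key_len,) = struct.unpack_from("=i", buf, offset)
        offset += 4
        key = buf[offset:offset + key_len].decode("utf-8")
        offset += key_len
        (value_len,) = struct.unpack_from("=i", buf, offset)
        offset += 4
        value = buf[offset:offset + value_len].decode("utf-8")
        offset += value_len
        pairs[key] = value
    return pairs


# The hint discussed in this thread for DateTime fields exposed as strings.
blob = encode_arrow_metadata({"GDAL:OGR:type": "DateTime"})
print(decode_arrow_metadata(blob))
```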
OGRLayer::GetArrowStream(): when DATETIME_AS_STRING=YES, expose "GDAL:OGR:type":"DateTime" metadata in the ArrowSchema of DateTime fields
CreateFieldFromArrowSchema(): take into account GDAL:OGR:type=DateTime when ArrowSchema.format='u' (string)
ogr2ogr: GPKG/FlatGeobuf -> other format: in Arrow code path, use DATETIME_AS_STRING to preserve origin timezone
Fixes #11212
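The consumer side of the metadata hint could look roughly like the sketch below. It is only an illustration of the decision described in the commit titles above, not the actual CreateFieldFromArrowSchema() logic; the function name and the returned type names are hypothetical, and only the string and timestamp formats are covered.

```python
def ogr_type_for_arrow_field(arrow_format, metadata=None):
    """Pick an OGR field type name from an Arrow format string and field metadata.

    Hypothetical helper: returns "DateTime", "String", or None when the
    format is outside the scope of this sketch.
    """
    metadata = metadata or {}
    if arrow_format.startswith("ts"):
        # Arrow timestamp formats ("tss:", "tsm:", "tsu:", "tsn:" plus timezone).
        return "DateTime"
    if arrow_format == "u":
        # A UTF-8 string is treated as a DateTime only when the hint is present.
        if metadata.get("GDAL:OGR:type") == "DateTime":
            return "DateTime"
        return "String"
    return None


print(ogr_type_for_arrow_field("u", {"GDAL:OGR:type": "DateTime"}))
print(ogr_type_for_arrow_field("u"))
```

With this kind of check, a string column produced with DATETIME_AS_STRING=YES can be recreated as a DateTime field on the writing side, while ordinary string columns are unaffected.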